A Dual Digraph Approach for Leaderless Atomic Broadcast (Extended Version)
Many distributed systems work on a common shared state; in such systems,
distributed agreement is necessary for consistency. With an increasing number
of servers, these systems become more susceptible to single-server failures,
increasing the relevance of fault-tolerance. Atomic broadcast enables
fault-tolerant distributed agreement, yet it is costly to solve. Most practical
algorithms entail linear work per broadcast message. AllConcur -- a leaderless
approach -- reduces the work by connecting the servers via a sparse resilient
overlay network; yet this resiliency entails redundancy, limiting the
reduction of work. In this paper, we propose AllConcur+, an atomic broadcast
algorithm that lifts this limitation: During intervals with no failures, it
achieves minimal work by using a redundancy-free overlay network. When failures
do occur, it automatically recovers by switching to a resilient overlay
network. In our performance evaluation of non-failure scenarios, AllConcur+
achieves comparable throughput to AllGather -- a non-fault-tolerant distributed
agreement algorithm -- and outperforms AllConcur, LCR and Libpaxos in terms of
both throughput and latency. Furthermore, our evaluation of failure
scenarios shows that AllConcur+'s expected performance is robust with regard to
occasional failures. Thus, for realistic use cases, leveraging redundancy-free
distributed agreement during intervals with no failures improves performance
significantly.
Comment: Overview: 24 pages, 6 sections, 3 appendices, 8 figures, 3 tables.
Modifications from the previous version: extended the evaluation of AllConcur+
with a simulation of a multi-datacenter deployment.
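
As a rough illustration of the dual-digraph idea above, the sketch below (Python; not the authors' implementation) keeps two overlay networks per server and switches from the redundancy-free one to the resilient one when a failure is reported. The concrete graphs, the failure detector, and the agreement rounds themselves are all assumed or elided here.

    class DualOverlayNode:
        def __init__(self, node_id, ring_successor, resilient_successors):
            self.node_id = node_id
            self.ring_successor = ring_successor              # redundancy-free overlay
            self.resilient_successors = resilient_successors  # fault-tolerant overlay
            self.failure_free = True                          # current operating mode

        def successors(self):
            # Failure-free intervals forward on the sparse, redundancy-free
            # overlay; after a failure the resilient digraph is used instead.
            if self.failure_free:
                return [self.ring_successor]
            return self.resilient_successors

        def on_failure_notification(self, failed_node_id):
            # Switching overlays is the recovery step: messages possibly lost
            # on the sparse overlay are re-disseminated on the resilient one.
            self.failure_free = False

        def forward(self, message, send):
            for successor in self.successors():
                send(successor, message)

    node = DualOverlayNode(0, ring_successor=1, resilient_successors=[2, 3, 4])
    node.forward("m1", send=print)                 # one send, on the ring
    node.on_failure_notification(failed_node_id=1)
    node.forward("m2", send=print)                 # three sends, resilient digraph
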
Hermes: a Fast, Fault-Tolerant and Linearizable Replication Protocol
Today's datacenter applications are underpinned by datastores that are
responsible for providing availability, consistency, and performance. For high
availability in the presence of failures, these datastores replicate data
across several nodes. This is accomplished with the help of a reliable
replication protocol that is responsible for maintaining the replicas
strongly consistent even when faults occur. Strong consistency is preferred
over weaker consistency models, which cannot guarantee intuitive behavior for
clients. Furthermore, to accommodate high demand at real-time latencies,
datastores must deliver high throughput and low latency.
This work introduces Hermes, a broadcast-based reliable replication protocol
for in-memory datastores that provides both high throughput and low latency by
enabling local reads and fully-concurrent fast writes at all replicas. Hermes
couples logical timestamps with cache-coherence-inspired invalidations to
guarantee linearizability, avoid write serialization at a centralized ordering
point, resolve write conflicts locally at each replica (hence ensuring that
writes never abort) and provide fault-tolerance via replayable writes. Our
implementation of Hermes over an RDMA-enabled reliable datastore with five
replicas shows that Hermes consistently achieves higher throughput than
state-of-the-art RDMA-based reliable protocols (ZAB and CRAQ) across all write
ratios while also significantly reducing tail latency. At 5% writes, the tail
latency of Hermes is 3.6X lower than that of CRAQ and ZAB.
Comment: Accepted in ASPLOS 2020.
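
The invalidation-based write path can be illustrated with a minimal, single-threaded sketch (Python). The names Replica, inv, val and write are illustrative, not the paper's API, and message transport, failure handling and write replays are elided; what it does show is the INV/VAL phases, local reads, and local conflict resolution via (version, node_id) logical timestamps.

    VALID, INVALID = "valid", "invalid"

    class Replica:
        def __init__(self, rid):
            self.rid = rid
            self.store = {}  # key -> (value, (version, node_id), state)

        def read(self, key):
            value, ts, state = self.store[key]
            assert state == VALID, "reads block while a write is in flight"
            return value  # local read, no remote coordination

        def inv(self, key, value, ts):
            # Invalidation: accept only a higher logical timestamp; the
            # lexicographic (version, node_id) order resolves concurrent
            # writes identically at every replica, so writes never abort.
            _, current_ts, _ = self.store.get(key, (None, (0, 0), VALID))
            if ts > current_ts:
                self.store[key] = (value, ts, INVALID)
            return True  # ack

        def val(self, key, ts):
            value, current_ts, _ = self.store[key]
            if ts == current_ts:  # validate only the matching write
                self.store[key] = (value, current_ts, VALID)

    def write(coordinator_id, replicas, key, value, version):
        ts = (version, coordinator_id)                    # logical timestamp
        acks = [r.inv(key, value, ts) for r in replicas]  # INV phase
        if all(acks):                                     # all replicas invalidated
            for r in replicas:
                r.val(key, ts)                            # VAL phase

    replicas = [Replica(rid) for rid in range(3)]
    write(coordinator_id=0, replicas=replicas, key="x", value=42, version=1)
    assert all(r.read("x") == 42 for r in replicas)       # reads are local
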
Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes
Many parallel applications suffer from latent performance limitations that may prevent them from scaling to larger machine sizes. Often, such scalability bugs manifest themselves only when an attempt to scale the code is actually being made, a point where remediation can be difficult. However, creating analytical performance models that would allow such issues to be pinpointed earlier is so laborious that application developers attempt it at most for a few selected kernels, running the risk of missing harmful bottlenecks. In this paper, we show how both the coverage and the speed of this scalability analysis can be substantially improved. By generating an empirical performance model automatically for each part of a parallel program, we can easily identify those parts that will reduce performance at larger core counts. Using a climate simulation as an example, we demonstrate that scalability bugs are not confined to those routines usually chosen as kernels.
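
A minimal sketch of this approach (Python with NumPy least squares; the authors' actual search space and statistics differ) fits a few candidate model shapes t(p) = c1 + c2 * p^i * log2(p)^j to small-scale measurements of one program part and extrapolates the best fit to a larger core count. The measurements below are hypothetical.

    import math
    import numpy as np

    # Candidate (i, j) exponents for t(p) = c1 + c2 * p**i * log2(p)**j.
    CANDIDATES = [(0, 1), (0.5, 0), (1, 0), (1, 1), (2, 0)]

    def fit_model(procs, times):
        best = None  # (squared error, (i, j), coefficients)
        for i, j in CANDIDATES:
            X = np.column_stack([
                np.ones(len(procs)),
                [p**i * math.log2(p)**j for p in procs],
            ])
            coeffs, residual, _, _ = np.linalg.lstsq(X, np.array(times),
                                                     rcond=None)
            err = float(residual[0]) if residual.size else float("inf")
            if best is None or err < best[0]:
                best = (err, (i, j), coeffs)
        return best

    # Hypothetical runtimes of one program part at small scales.
    procs = [64, 128, 256, 512]
    times = [1.0, 1.9, 4.1, 8.2]
    err, (i, j), (c1, c2) = fit_model(procs, times)
    # Extrapolate to a target scale to flag a potential scalability bug.
    predicted = c1 + c2 * 100_000**i * math.log2(100_000)**j
    print(f"best fit: {c1:.2f} + {c2:.3g} * p^{i} * log2(p)^{j}; "
          f"predicted t(100000) ~ {predicted:.1f}")
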
HovercRaft: Achieving Scalability and Fault-tolerance for microsecond-scale Datacenter Services
Cloud platform services must simultaneously be scalable, meet low tail-latency service-level objectives, and be resilient to a combination of software, hardware, and network failures. Replication plays a fundamental role in meeting both the scalability and the fault-tolerance requirements, but is subject to opposing requirements: (1) scalability is typically achieved by relaxing consistency; (2) fault-tolerance is typically achieved through the consistent replication of state machines. Adding nodes to a system can therefore either increase performance at the expense of consistency, or increase resiliency at the expense of performance. We propose HovercRaft, a new approach by which adding nodes increases both the resilience and the performance of general-purpose state-machine replication. We achieve this through an extension of the Raft protocol that carefully eliminates CPU and I/O bottlenecks and load-balances requests. Our implementation uses state-of-the-art kernel-bypass techniques, datacenter transport protocols, and in-network programmability to deliver up to 1 million operations/second for clusters of up to 9 nodes, a linear speedup over an unreplicated configuration for selected workloads, and a 4× speedup for the YCSB-E benchmark running on Redis over an unreplicated deployment.
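
Two of the load-balancing ideas can be sketched as follows (Python; illustrative only, since the real system relies on Raft, kernel bypass and in-network programmability): the ordered log carries only request IDs, because payloads are disseminated to every node separately, and each reply is sent by exactly one deterministically chosen replica rather than always by the leader.

    import hashlib

    class Node:
        def __init__(self, node_id, cluster_size):
            self.node_id = node_id
            self.cluster_size = cluster_size
            self.payloads = {}  # request_id -> payload, disseminated separately
            self.log = []       # ordered request IDs (Raft agreement elided)

        def receive_request(self, request_id, payload):
            # Payloads reach every node directly, so log entries (and the
            # leader's replication traffic) carry only small request IDs.
            self.payloads[request_id] = payload

        def apply(self, request_id, execute):
            self.log.append(request_id)
            result = execute(self.payloads[request_id])
            # Pick one replica deterministically to answer this request,
            # spreading response I/O across the whole cluster.
            digest = hashlib.sha1(request_id.encode()).hexdigest()
            responder = int(digest, 16) % self.cluster_size
            return result if responder == self.node_id else None

    nodes = [Node(i, cluster_size=3) for i in range(3)]
    for n in nodes:
        n.receive_request("req-1", {"op": "GET", "key": "x"})
    replies = [n.apply("req-1", lambda p: ("ok", p["key"])) for n in nodes]
    assert sum(r is not None for r in replies) == 1  # exactly one node replies
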